Abstract-Analysis of electricity price characteristics is crucial to 
understand the behaviour of derivatives pricing and to quantify 
the risk in electricity markets. In this paper, a multi-cycle 
GARCH-M model with skewed student-t distribution for 
residuals, which is solved by maximum likelihood estimation, is 
proposed. The model can explicitly address the relationship with 
system loads, heteroskedasticities, seasonalities, time-varying 
skewnesses and heavy-tails of electricity prices. The empirical 
analysis based on historical data from the PJM electricity 
market shows that the conditional variance and system load have 
a significant effect on average daily electricity prices, that the 
prices exhibit volatility clustering and weekly, semi-monthly, 
monthly, bimonthly, quarterly and semi-annual cycles, and that 
the variances and heavy tails of electricity prices are clearly 
time-varying. The model has a parsimonious set of estimated 
parameters, low computational cost and high practical value, 
and its orders are easy to select. 

Keywords-skewed student-t distribution; skewness and fat-tail; 
multi-cycle; volatility clustering; GARCH-M model 

I. INTRODUCTION 

With the rapid growth of derivative securities in electricity 
markets, the modelling and management of price risk have 
become important topics for researchers and practitioners. 
Electricity price is set by the market clearing price at which 
quantity supply is equal to quantity demand. It is influenced 
not only by the objective factors such as climates, loads, 
power generating costs, available generation capacities and 
network congestions, but also by the subjective factors such 
as market trading rules, participants' bidding strategies and 
their psychological reactions to price changes. All of these 
factors make accurate price forecasting a complex issue. 
The current methods to predict electricity price 
can be divided into long- and short-term forecasting methods 
[1]. The long-term electricity price forecasting can be 
achieved by simulating the competition rules, which mainly 
includes oligopoly equilibrium model, probabilistic 
production simulation and intelligent agent simulation. With 
the statistical analysis for historical data, the mathematical 
model to reflect price changes can be established and the 
short-term price forecasting can be obtained. The short-term 
forecasting methods mainly include time series analysis 
(TSA), artificial neural networks (ANNs) and grey system 
forecasting models. 

ANNs have been widely used for nonlinear multivariate 
problems because of the adaptive ability to uncertain fuzzy 
systems [2-10]. In [2,3], the electricity spot prices in the Victoria, 
Spain and California markets were predicted by three-layer 
feedforward ANNs trained with the back propagation and 
Levenberg-Marquardt algorithms. In [4,5], the performance of 
Gaussian radial basis function ANNs (GRFANNs) and traditional 
ANNs was compared, indicating superior performance of the 
GRFANNs, with faster learning speed and better approximation 
capability, so that GRFANNs are more suitable for short-term 
electricity price forecasts. In [6-10], approaches for forecasting 
short-term electricity prices using combinations of fuzzy logic, 
Kalman filters, support vector machines and ANNs were proposed; 
the results show that significantly improved prediction 
performance can be achieved by using the hybrid forecasting 
methods. However, the lower learning speed and the inflexibility 
of parameter adjustment have impeded their application in 
practice. 

With relatively small historical data required, TSA can 
accurately reflect the continuous changes of the historical data. 
Autoregressive moving average (ARMA) and ARMA with 
exogenous variables (ARMAX) models are the commonly 
used TSA methods. In [11], an ARMAX model with load as 
an exogenous explanatory variable was used to predict the 
next 24-hour spot price in the PJM electricity markets. 
Considering the non-constant mean and variance for most 
electricity price time series, an electricity price forecasting 
method based on autoregressive integrated moving average 
(ARIMA) model has been proposed in [12], but the impacts of 
load and other factors were not considered. Cuaresma et al. 
[13] have noted that each hour of days is also an important 
factor to influence electricity prices and an ARMA-based 
period-decoupled forecasting model was proposed, showing 
greatly improved prediction accuracy for the price spikes. The 
electricity price forecasting methods which combine ARIMA 
with predicted errors improvement and wavelet transfer 
function were respectively proposed in [14] and [15]. A 
period-decoupled electricity price forecasting method based 
on transfer function models, taking the effect of load on 
electricity price and the non-stationary properties of price 
series into account, was presented in [16], and further 
improved the prediction accuracy. However, because they assume 
that prices are normally distributed with constant variance, 
these models cannot effectively deal with the seasonality, 
heteroscedasticity and heavy tails of electricity prices, and 
their higher computational cost and larger number of estimated 
parameters have also impeded their wide use in practice. 

Compared with the normal distribution, electricity price series 
exhibit positive skewness and excess kurtosis, which should be 
taken into account when evaluating price risk in electricity 
markets. Based on a comprehensive analysis of the basic features 
and influencing factors of electricity prices, a multi-cycle 
GARCH-M model with skewed student-t distribution for residuals 
(hereafter, st-GARCH-M) is proposed, in which heteroscedasticity, 
seasonality, kurtosis and heavy tails, volatility clustering and 
the relationship to system loads are jointly addressed. The 
model has the advantages of low computational cost and a 
parsimonious set of estimated parameters. A numerical example 
based on historical data from the PJM market shows that the 
conditional variance and system load have significant effects on 
the average daily electricity prices, that there are volatility 
clustering and weekly, semi-monthly, quarterly and semi-annual 
cycles, and that the variance of the electricity price series and 
the degree of freedom of the skewed student-t distribution are 
clearly time-varying. 

II. MODEL AND SOLUTION METHOD 

A. Multi-Cycle St-GARCH-M Model 

An electricity price forecasting model can be viewed as a 
multi-input single-output system, in which the output variable 
is the electricity price and the input variables are the impact 
factors of electricity price such as fuel prices, seasonality, 
climate, load and bidding strategies of market participants. 
In this paper, the system is described by a multi-cycle 
st-GARCH-M model. The market clearing prices and system loads 
are publicly available in markets all over the world. Therefore, 
the system loads at hours t, t-1, and the electricity prices at 
hours t-1, t-2, are selected as the input variables. Assuming 
that $p_t$, $d_t$, $\varepsilon_t$ and $z_t$ denote the 
electricity spot price, the system load, the residual and the 
standardized residual at hour t, the multi-cycle st-GARCH-M 
model can be formulated as: 

$$p_t = f(t) + \gamma(B) d_t + \varphi(B) p_t + \theta(B) \sqrt{h_t} + \kappa(B) \varepsilon_t$$

$$\varepsilon_t = \sqrt{h_t}\, z_t, \quad \varepsilon_t \mid I_{t-1} \sim D(0, h_t), \quad z_t \mid I_{t-1} \sim D(0, 1) \qquad (1)$$
where B is the backshift operator, m denotes the number of 
cycles of the electricity price series per year, $u$, $v$, $p$ 
and $q$ represent respectively the lagged orders of $d_t$, 
$\sqrt{h_t}$, $p_t$ and $\varepsilon_t$ in the mean equation, 
$r_h$ and $s_h$ denote the lagged orders of $h_t$ and 
$\varepsilon_t^2$ in the conditional variance equation, 
$I_{t-1}$ denotes the information set available up to period 
$t-1$, $h_t$ denotes the conditional variance of 
$\varepsilon_t$, $f(t)$ denotes the time trend and seasonal 
changes, $d_{wkd}$ is a dummy variable that takes a value of 1 
if the observation falls on a weekday and zero otherwise, and 
$\alpha = (\alpha_0, \alpha_1, \alpha_2, \alpha_{1i}, \alpha_{2i})$, 
$\gamma = (\gamma_0, \ldots, \gamma_u)$, 
$\varphi = (\varphi_1, \ldots, \varphi_p)$, 
$\kappa = (\kappa_1, \ldots, \kappa_q)$, 
$\theta = (\theta_1, \ldots, \theta_v)$ and 
$\beta = (\beta_0, \beta_{11}, \ldots, \beta_{1 r_h}, \beta_{21}, \ldots, \beta_{2 s_h})$ 
are the estimated parameters. With this general formulation for 
the sinusoidal function we allow for the possibility of having 
many cycles per year, and the amplitude and location of the 
peak of the ith cycle are captured by $\alpha_{1i}$ and 
$\alpha_{2i}$, respectively. The constraints $\beta_0 > 0$ and 
$\beta_{1i}, \beta_{2j} \ge 0$, $\forall i \in [1, r_h]$, 
$j \in [1, s_h]$, are needed to guarantee that the conditional 
variance is strictly positive and that the process does not 
degenerate. 
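
The role of the positivity constraints on the variance parameters can be illustrated with a short sketch. The conditional variance equation itself is not legible in this copy, so the code assumes the standard GARCH recursion $h_t = \beta_0 + \sum_i \beta_{1i} h_{t-i} + \sum_j \beta_{2j} \varepsilon_{t-j}^2$, which matches the parameter definitions above but is an assumption. The function name `garch_variance` and the start-up value `h0` are hypothetical choices made for illustration.

```python
import numpy as np

def garch_variance(eps, beta0, beta1, beta2, h0=None):
    """Conditional variances under an ASSUMED standard GARCH recursion.

    eps   : residual series
    beta0 : constant term, must be > 0
    beta1 : coefficients on lagged conditional variances, each >= 0
    beta2 : coefficients on lagged squared residuals, each >= 0
    h0    : start-up variance (assumption: sample variance if omitted)
    """
    eps = np.asarray(eps, dtype=float)
    beta1 = np.atleast_1d(np.asarray(beta1, dtype=float))
    beta2 = np.atleast_1d(np.asarray(beta2, dtype=float))
    # The constraints that keep h_t strictly positive.
    if beta0 <= 0 or np.any(beta1 < 0) or np.any(beta2 < 0):
        raise ValueError("need beta0 > 0 and all beta1, beta2 >= 0")
    if h0 is None:
        h0 = float(np.var(eps))
    start = max(len(beta1), len(beta2))
    h = np.full(len(eps), float(h0))
    for t in range(start, len(eps)):
        garch_part = sum(beta1[i] * h[t - 1 - i] for i in range(len(beta1)))
        arch_part = sum(beta2[j] * eps[t - 1 - j] ** 2 for j in range(len(beta2)))
        h[t] = beta0 + garch_part + arch_part
    return h

# Small hand-checkable example with one lag of each term.
h = garch_variance([1.0, -2.0, 0.5], beta0=0.1, beta1=0.8, beta2=0.1, h0=1.0)
```

With non-negative coefficients and a positive constant, every term in the recursion is non-negative, so the variance path cannot reach zero.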

B. Parameters Calibration 

Before parameter calibration, an assumption on the 
distribution of the residuals needs to be made. Assuming that 
the probability density function (PDF) of the standardized 
residual $z_t$ follows the skewed student-t distribution, 
the conditional PDF of $\varepsilon_t$ can be expressed as [17]: 

$$\ldots \; \sqrt{\pi(\eta_t - 2)}\, \Gamma\!\left(\frac{\eta_t}{2}\right) \; \ldots \qquad (2)$$

where $\Gamma$ is the Gamma function, and $\lambda_t$ and 
$\eta_t$ are the conditional skewness and degree of freedom of 
the skewed student-t distribution of $z_t$, respectively, which 
can be calculated by: 

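$$\eta_t = L_\eta + \frac{U_\eta - L_\eta}{1 + \exp(-\omega_t)}, \qquad \lambda_t = L_\lambda + \frac{U_\lambda - L_\lambda}{1 + \exp(-\tau_t)}$$

$$\omega_t = \delta_0 + \sum_{i=1}^{r_\eta} \delta_{1i} \varepsilon_{t-i} + \sum_{i=1}^{s_\eta} \delta_{2i} \varepsilon_{t-i}^2, \qquad \tau_t = \mu_0 + \sum_{i=1}^{r_\lambda} \mu_{1i} \varepsilon_{t-i} + \sum_{i=1}^{s_\lambda} \mu_{2i} \varepsilon_{t-i}^2 \qquad (3)$$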

where $U_\eta$ and $L_\eta$ denote the upper and lower limits 
of $\eta_t$, $r_\eta$ and $s_\eta$ are respectively the lagged 
orders of $\varepsilon_t$ and $\varepsilon_t^2$ in the 
conditional degree-of-freedom equation, $U_\lambda$ and 
$L_\lambda$ denote the upper and lower limits of $\lambda_t$, 
$r_\lambda$ and $s_\lambda$ are respectively the lagged orders 
of $\varepsilon_t$ and $\varepsilon_t^2$ in the conditional 
skewness equation, and 
$\delta = (\delta_0, \delta_{11}, \ldots, \delta_{1 r_\eta}, \delta_{21}, \ldots, \delta_{2 s_\eta})$ and 
$\mu = (\mu_0, \mu_{11}, \ldots, \mu_{1 r_\lambda}, \mu_{21}, \ldots, \mu_{2 s_\lambda})$ 
are the parameters to be estimated. 

Let $\xi = (\alpha, \varphi, \theta, \gamma, \kappa, \beta, \delta, \mu)$; 
the log-likelihood function for all observations corresponding 
to $\varepsilon_t$ is given by: 

$$L(\xi) = \sum_{t} l_t(\xi) \qquad (4)$$

where $l_t(\xi) = \ln g(\varepsilon_t \mid I_{t-1})$ is the 
log-likelihood function for one observation at period t. By 
maximizing $L(\xi)$, the estimate $\hat{\xi}$ of the parameters 
$\xi$ can be obtained. It is important to note that the 
log-likelihood function $L(\xi)$ is highly nonlinear. Therefore 
the starting values of the parameters $\xi$ must be selected 
with care. In order to improve the accuracy of estimation, a 
successive approximation method, namely using the parameters 
estimated from simpler models as starting values for more 
complex ones, is used in this paper. 

C. Model Checking 

In large samples, the distribution of the maximum likelihood 
estimate $\hat{\xi}$ can be approximated by a normal 
distribution: 

$$\hat{\xi} \sim N\!\left(\xi_0, \left[-H(\xi_0)\right]^{-1}\right) \qquad (5)$$

where $\xi_0$ denotes the true values of the parameters $\xi$ 
and $H$ is the Hessian matrix. A consistent estimate of 
$H(\xi_0)$ can be obtained by evaluating 
$\partial^2 L(\xi) / \partial\xi\, \partial\xi'$ at $\hat{\xi}$. 
After calculating the variance of $\hat{\xi}$, the significance 
of the estimated parameters can be tested using t-statistics. 

The Nyblom statistic, which has the advantage that its 
asymptotic distribution depends only on the number of 
estimated parameters, is used to test the constancy of the 
proposed model [18]. The Nyblom statistic $W_N$ can be 
expressed as: 

The Nyblom statistic can also be used to test the constancy 
of a single estimated parameter. The Nyblom statistic $W_k$ 
corresponding to the kth estimated parameter is given by: 

(7) 

where $S_{kt}$ is the kth element of $S_t$ and $V_{kk}$ is the 
kth diagonal element of $V$. 
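
The large-sample approximation in (5) can be turned into standard errors and t-statistics numerically. The sketch below is a minimal illustration, not the estimation code used in this paper: it approximates the Hessian by central finite differences and inverts its negative. A toy Gaussian log-likelihood stands in for $L(\xi)$, and the names `numerical_hessian`, `t_statistics` and `toy_loglik` are hypothetical.

```python
import numpy as np

def numerical_hessian(f, x, step=1e-4):
    """Central finite-difference Hessian of a scalar function f at x."""
    x = np.asarray(x, dtype=float)
    k = x.size
    hess = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = step
            ej[j] = step
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * step ** 2)
    return hess

def t_statistics(loglik, xi_hat):
    """t-statistics and standard errors from the inverse negative Hessian."""
    xi_hat = np.asarray(xi_hat, dtype=float)
    cov = np.linalg.inv(-numerical_hessian(loglik, xi_hat))
    std_err = np.sqrt(np.diag(cov))
    return xi_hat / std_err, std_err

# Toy stand-in for L(xi): Gaussian log-likelihood in (mean, std),
# with additive constants dropped. This is NOT the st-GARCH-M likelihood.
data = np.array([1.0, 2.0, 4.0, 5.0, 8.0])

def toy_loglik(xi):
    mean, std = xi
    return -data.size * np.log(std) - np.sum((data - mean) ** 2) / (2.0 * std ** 2)

xi_hat = np.array([data.mean(), data.std()])  # closed-form MLE of the toy model
t_stats, std_err = t_statistics(toy_loglik, xi_hat)
```

For the toy model the standard error of the mean has the closed form std/sqrt(n), which gives a quick check that the numerical Hessian is behaving.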

The Cramer-Von Mises statistic can be used to test whether the 
distribution of residuals is consistent with the skewed 
student-t distribution. Let $F_N(z)$ denote the cumulative 
distribution function (CDF) of the skewed student-t 
distribution and $F(z)$ denote the actual CDF of the residuals. 
Then the Cramer-Von Mises statistic can be formulated as: 

$$W_{CVM} = \sum_{t=1} \left( F_N(z_t) - F(z_t) \right)^2 \qquad (8)$$
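
The statistic in (8) can be sketched in a few lines. Two assumptions are made here and should be read as such: the skewed student-t density is taken to be Hansen's (1994) zero-mean, unit-variance parameterization (whether [17] uses exactly this form cannot be confirmed from this copy), and the actual CDF $F$ is taken to be the empirical CDF of the standardized residuals. The CDF is obtained by trapezoidal integration on a fixed grid, so it is approximate. All function names are hypothetical.

```python
import math
import numpy as np

def skewed_t_pdf(z, eta, lam):
    """Hansen-type skewed Student-t density (ASSUMED form), eta > 2, |lam| < 1."""
    if not (eta > 2.0 and -1.0 < lam < 1.0):
        raise ValueError("need eta > 2 and -1 < lam < 1")
    z = np.asarray(z, dtype=float)
    c = math.exp(math.lgamma((eta + 1.0) / 2.0) - math.lgamma(eta / 2.0))
    c /= math.sqrt(math.pi * (eta - 2.0))
    a = 4.0 * lam * c * (eta - 2.0) / (eta - 1.0)
    b = math.sqrt(1.0 + 3.0 * lam ** 2 - a ** 2)
    # Left of the mode -a/b the scale is (1 - lam), right of it (1 + lam).
    scale = np.where(z < -a / b, 1.0 - lam, 1.0 + lam)
    core = 1.0 + ((b * z + a) / scale) ** 2 / (eta - 2.0)
    return b * c * core ** (-(eta + 1.0) / 2.0)

def skewed_t_cdf(z, eta, lam, lo=-200.0, hi=200.0, points=400001):
    """Approximate CDF by trapezoidal integration of the density on a grid."""
    grid = np.linspace(lo, hi, points)
    dens = skewed_t_pdf(grid, eta, lam)
    areas = (dens[1:] + dens[:-1]) * np.diff(grid) / 2.0
    cum = np.concatenate(([0.0], np.cumsum(areas)))
    return np.interp(z, grid, cum)

def cvm_statistic(z, model_cdf):
    """Sum of squared gaps between a model CDF and the empirical CDF."""
    z = np.sort(np.asarray(z, dtype=float))
    empirical = np.arange(1, z.size + 1) / z.size  # share of residuals <= z_t
    return float(np.sum((model_cdf(z) - empirical) ** 2))

# Usage with hypothetical standardized residuals and fixed eta, lam.
residuals = [-1.2, -0.4, 0.1, 0.3, 2.5]
w_cvm = cvm_statistic(residuals, lambda v: skewed_t_cdf(v, 6.0, 0.3))
```

A fixed pair (eta, lam) is used for simplicity; with time-varying $\eta_t$ and $\lambda_t$ the model CDF would have to be evaluated observation by observation.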



D. Forecasting Accuracy Evaluation 

Generally speaking, the electricity price forecasting model 
is one with time-varying parameters, and its parameters 
should be modified by the new available data in order to 
improve the forecasting accuracy. In this paper, the mean 
absolute percentage error (MAPE) is used to evaluate the 
forecasting accuracy. It can be calculated as follows: 



$$MAPE = \frac{1}{n} \sum_{t=1}^{n} \frac{\left| \hat{p}_t - p_t \right|}{p_t} \qquad (9)$$

where $\hat{p}_t$ and $p_t$ respectively refer to the forecasted 
and actual realized electricity prices at period t, and n is the 
number of periods to be forecasted. 
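
As a small worked illustration of (9), the sketch below computes the MAPE for two hypothetical forecasts. The function name `mape` and the sample prices are made up for the example; the absolute value in the denominator is a safeguard that makes no difference for positive prices.

```python
def mape(forecast, actual):
    """Mean absolute percentage error, returned as a fraction (0.1 = 10%)."""
    if len(forecast) != len(actual) or len(actual) == 0:
        raise ValueError("forecast and actual must be non-empty and equal in length")
    total = sum(abs(f - a) / abs(a) for f, a in zip(forecast, actual))
    return total / len(actual)

# Two hypothetical days: forecasts of 110 and 90 $/MWh against actual 100 $/MWh.
example = mape([110.0, 90.0], [100.0, 100.0])
```

Both forecasts miss by 10% of the realized price, so the example evaluates to 0.1, that is, a MAPE of 10%.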

III. EMPIRICAL RESULTS 

The PJM is organized as a day-ahead market. Participants 
submit their buying and selling bid curves for each of the next 
24 hours. The market operator then aggregates the bids for each 
hour and determines market clearing prices and volumes for 
each hour of the following day. In this paper, a total of 1197 
observations of average daily electricity spot prices in dollars 
per megawatt hour ($/MWh) and average daily loads in 
gigawatts (GW) are employed to validate the performance of 
the multi-cycle st-GARCH-M model. The sample period 
begins on 1 Jun. 2007 and ends on 9 Sep. 2010. Table I 
presents some descriptive statistics for the average daily 
electricity spot price and load series. It can be seen from Table 
I that electricity prices and loads are quite volatile, clearly 
non-normal and skewed rightward, with a median well 
below the mean. Indeed, the null hypothesis of normality is 
rejected for both the price and load series by the Jarque-Bera 
test. This is typical for electricity spot prices in a competitive 
market. 
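
For reference, the Jarque-Bera statistic combines sample skewness $S$ and kurtosis $K$ as $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$ and is asymptotically chi-square distributed with 2 degrees of freedom under normality. The sketch below is a plain-Python illustration on made-up numbers, not the software used for Table I; the function name `jarque_bera` is hypothetical.

```python
def jarque_bera(x):
    """Jarque-Bera statistic from sample skewness and kurtosis."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

# Tiny symmetric sample: skewness 0, kurtosis 1.5.
jb = jarque_bera([-1.0, 0.0, 1.0])
```

The 1% critical value of the chi-square distribution with 2 degrees of freedom is about 9.21, so statistics of the size reported in Table I reject normality decisively.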

By analysing the correlation coefficient, partial correlation 
coefficient and time trend chart of the sample data, the values 
of $m, p, q, u, v, r_h, s_h, r_\eta, s_\eta, r_\lambda, s_\lambda$ 
in the multi-cycle st-GARCH-M model can be identified. In our 
case, they are equal to 52, 7, 3, 1, 0, 1, 1, 1, 1, 1, 1, 
respectively. Table II shows the results of the maximum 
likelihood estimation (except for the intercept and conditional 
skewness coefficients, all coefficients that are not significant 
at the 95% confidence level have been removed). From Table II, 
the following conclusions can be derived: 

TABLE I DESCRIPTIVE STATISTICS OF THE SAMPLE DATA 

Statistics      Price ($/MWh)    Load (GW)
Mean            53.52041         81.19221
Median          49.97068         79.89221
Maximum         189.6557         115.7839
Minimum         24.87494         58.34586
Std. Dev.       20.20158         10.50560
Skewness        1.420081         0.375318
Jarque-Bera     1046.748         36.78506
(p-value)       (0.0000)         (0.0000)



TABLE II ESTIMATION RESULTS OF GARCH-M-ST MODEL 

Parameter          Estimate   Std. Err.  t-statistic  P value  Nyblom statistic
$\alpha_0$         -1.3102    0.7862     -1.666       0.0956   0.0660
$\alpha_2$         -1.9915    0.3278     -6.074       0.0000   0.1313
$\gamma_0$         0.9377     0.0293     32.006       0.0000   0.0758
$\gamma_1$         -0.8899    0.0268     -33.146      0.0000   0.0595
$\varphi_1$        0.9424     0.0125     75.254       0.0000   0.0635
$\varphi_?$        0.0281     0.0088     3.195        0.0014   0.0697
$\alpha_{1,2}$     -0.4977    0.1192     -4.175       0.0000   0.0304
$\alpha_{2,2}$     -0.1629    0.0446     -3.653       0.0003   0.1020
$\alpha_{1,4}$     -0.2069    0.0801     -2.583       0.0098   0.1778
$\alpha_{2,4}$     1.0293     0.1449     7.105        0.0000   0.1045
$\alpha_{1,24}$    362.29     3.1787     113.97       0.0000   0.0341
$\alpha_{2,24}$    -234.40    4.0323     -58.130      0.0000   0.0346
$\alpha_{1,52}$    -81.715    0.9176     -89.056      0.0000   0.0516
$\alpha_{2,52}$    -94.889    0.1796     -528.44      0.0000   0.2991
$\kappa_1$         -0.2360    0.0318     -7.426       0.0000   0.1541
$\kappa_2$         -0.2530    0.0321     -7.874       0.0000   0.0899
$\kappa_3$         -0.1470    0.0296     -4.959       0.0000   0.2236
$\theta$           0.0639     0.0315     2.032        0.0422   0.1576
$\beta_0$          0.2601     0.1164     2.236        0.0254   0.1292
$\beta_{11}$       0.8286     0.0253     32.814       0.0000   0.4837
$\beta_{21}$       0.2099     0.0375     5.600        0.0000   0.5081
$\delta_0$         -2.0287    0.3834     -5.291       0.0000   0.1355
$\delta_{11}$      0.3812     0.0907     4.203        0.0000   0.0275
$\delta_{21}$      0.0139     0.0046     3.036        0.0024   0.0530
$\tau_0$           0.5999     0.1053     5.699        0.0000   0.1360
$\tau_{11}$        -0.0178    0.0187     -0.951       0.3414   0.1219
$\tau_{21}$        0.0001     0.007      0.078        0.9380   0.0252

Maximum log likelihood          -2.7852
MAPE                            6.308%
Cramer-Von Mises statistic      0.1651
Nyblom statistic                8.3773



1) The MAPE of the st-GARCH-M model, 6.308%, is 
approximately equal to those of the models proposed in [11-16], 
but the number of estimated parameters is only 27, fewer than 
in those models. To some extent this reduces complexity, 
improves the computing speed, and strengthens the practical 
applicability of the model. 

2) The t-statistics for $\alpha_{1i}$, $\alpha_{2i}$, 
$i \in \{2, 4, 24, 52\}$ are significant at the 99% confidence 
level. This shows that there are weekly, semi-monthly, quarterly 
and semi-annual cycles in the sample period. The amplitudes of 
the peak for the weekly and semi-monthly cycles are larger than 
the others. 

3) The t-statistic for $\alpha_2$ is significant at the 99% 
confidence level. This shows that the impact of load on the 
average daily electricity prices differs markedly between 
weekdays and weekends. 

4) The t-statistics for $\gamma_0$ and $\gamma_1$ are 
significant at the 99% confidence level, indicating that $d_t$ 
has a marked impact on the average daily electricity prices. 
However, when $d_t$ is incorporated in the mean equation, the 
sign of $\alpha_2$ changes from positive to negative, showing 
that there are some substitution effects between $d_{wkd}$ and 
$d_t$. Moreover, when $d_t$ is replaced by $d_t^2$ in the mean 
equation, the MAPE is reduced from 6.308% to 6.16%, indicating 
that the relationship between load and electricity price may be 
more accurately described by $d_t^2$. 

5) The t-statistic for $\beta_{11}$ in the conditional variance 
equation is positive and significant at the 99% confidence 
level, indicating that the volatility of the electricity price 
series is strongly persistent. The impact of prior-period 
volatility on current-period volatility weakens gradually. 
As shown in Figure 1, there is clear volatility clustering: 
high conditional variance is followed by high conditional 
variance. 




Fig. 1 Conditional heteroskedasticity of price series 

6) The t-statistic of $\beta_{21}$ in the conditional variance 
equation is significant at the 99% confidence level, indicating 
that the volatility of the electricity price series is 
strengthened by external shocks. The sum of $\beta_{11}$ and 
$\beta_{21}$ is close to 1, indicating that there may be an 
integrated GARCH effect in the average daily electricity price 
series. The impacts of prior-period volatility and external 
shocks on current-period volatility are therefore long-lasting. 

7) The t-statistic for $\theta$ is significant at the 95% 
confidence level, showing that the conditional variance has a 
marked impact on the average daily electricity prices. Keeping 
the other explanatory variables constant, the average daily 
electricity prices increase with the volatility of the price 
series. 

8) The t-statistics of $\delta_{11}$ and $\delta_{21}$ are 
significant at the 99% confidence level, indicating that the 
conditional degree of freedom is clearly time-varying. The 
residuals and their squares exert a significant impact on the 
conditional degree of freedom of the skewed student-t 
distribution. The PDF of the conditional degree of freedom is 
depicted in Figure 2. It can be seen that the conditional 
degrees of freedom lie mainly between 2 and 8, indicating that 
the electricity price series has clearly fat tails. 




Fig. 2 Probability density of degree of freedom (horizontal axis: conditional degree of freedom) 

Fig. 3 Conditional skewness of residuals 

Fig. 4 Probability density of residuals 

9) The t-statistic of $\tau_0$ is positive and significant at 
the 99% confidence level, but the t-statistics of $\tau_{11}$ 
and $\tau_{21}$ are not significant at the 90% confidence level, 
indicating that in the sample period the electricity price 
series is clearly skewed rightward and can be described by 
homogeneous skewness. As shown in Figure 3, the conditional 
skewness values lie mainly between 0.2 and 0.32. 

10) The Cramer-Von Mises statistic is less than the critical 
limit at the 99% confidence level, indicating that the skewed 
student-t distribution is consistent with the actual 
distribution of residuals, as shown in Figure 4. 

11) The Nyblom statistics of all individual estimated 
parameters are less than the critical limit at the 99% 
confidence level, but the Nyblom statistic for the whole model 
is slightly larger than the critical limit at the 99% confidence 
level, indicating some instability in the above model. When 
$d_{wkd}$ or $d_t$ is removed from the mean equation, the 
Nyblom statistic of the whole model falls below the critical 
limit at the 99% confidence level. One possible explanation is 
that there are some substitution effects between $d_{wkd}$ and 
$d_t$. How to treat these two influencing factors more 
reasonably will therefore be the main problem to be solved in 
future research. 
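
The time variation of the conditional degree of freedom discussed in item 8 can be illustrated with a short sketch. It assumes that the degree of freedom is a logistic transform, bounded between a lower and an upper limit, of $\omega_t = \delta_0 + \delta_{11} \varepsilon_{t-1} + \delta_{21} \varepsilon_{t-1}^2$, which is consistent with the lag orders of 1 identified above. The $\delta$ values are the estimates in Table II, but the bounds of 2 and 30 are hypothetical, since the limits used in the paper are not stated here. The function names are also hypothetical.

```python
import math

def bounded_logistic(x, lower, upper):
    """Map a real number x into the interval [lower, upper] via a logistic curve."""
    # Two branches keep exp() from overflowing for large |x|.
    if x >= 0:
        share = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        share = e / (1.0 + e)
    return lower + (upper - lower) * share

def conditional_dof(eps_prev, delta0, delta11, delta21, lower, upper):
    """Degree of freedom implied by last period's residual (ASSUMED one-lag form)."""
    omega = delta0 + delta11 * eps_prev + delta21 * eps_prev ** 2
    return bounded_logistic(omega, lower, upper)

# delta estimates from Table II; the bounds (2.0, 30.0) are hypothetical.
dof_path = [conditional_dof(shock, -2.0287, 0.3812, 0.0139, 2.0, 30.0)
            for shock in (-5.0, 0.0, 5.0)]
```

With these coefficients a negative residual lowers the implied degree of freedom, which corresponds to a fatter tail in the next period, while a positive residual raises it.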

IV. CONCLUSIONS 

Based on a comprehensive analysis of the basic features and 
influencing factors of electricity prices, a multi-cycle 
st-GARCH-M model is proposed, in which the heteroscedasticity, 
skewness, kurtosis and fat tails, and seasonality of electricity 
price series are handled by a time-varying variance, a 
time-varying skewness, a time-varying degree of freedom and a 
sinusoidal function, respectively. The model has the advantages 
of low computational cost and a parsimonious set of estimated 
parameters. Moreover, the time trend and the relationship 
between load and spot price can also be taken into account. The 
numerical example based on historical data from the PJM 
electricity market shows that the conditional variances and 
system loads have significant effects on the average daily 
electricity prices, that there are volatility clustering and 
weekly, semi-monthly, quarterly and semi-annual cycles, and that 
the variance of the electricity price series and the degree of 
freedom of the skewed student-t distribution are clearly 
time-varying. However, the substitution effect between 
$d_{wkd}$ and $d_t$ in the mean equation may introduce some 
instability into the proposed model. How to address the 
relationship between load and spot price more reasonably, and 
so further improve the goodness of fit of the proposed model, 
is a relevant subject for future research. 